Application of latent semantic analysis for open-ended responses in a large, epidemiologic study

Authors

  • Travis D Leleu
  • Isabel G Jacobson
  • Cynthia A LeardMann
  • Besa Smith
  • Peter W Foltz
  • Paul J Amoroso
  • Marcia A Derr
  • Margaret AK Ryan
  • Tyler C Smith
Abstract

BACKGROUND The Millennium Cohort Study is a longitudinal cohort study designed in the late 1990s to evaluate how military service may affect long-term health. The purpose of this investigation was to examine the characteristics of Millennium Cohort Study participants who responded to the open-ended question, and to identify and investigate the most commonly reported areas of concern.

METHODS Participants who responded during the 2001-2003 and 2004-2006 questionnaire cycles were included in this study (n = 108,129). Latent Semantic Analysis (LSA) was applied to responses to a broad open-ended question asking participants whether they had any additional health concerns. Multivariable logistic regression was performed to examine the adjusted odds of responding to the open-text field, and cluster analysis was used to identify the major areas of concern among participants who provided open-ended responses.

RESULTS Participants who provided information in the open-ended text field (n = 27,916) had significantly lower self-reported general health than those who did not. The bulk of responses concerned a finite number of topics, most notably illness/injury, exposure, and exercise.

CONCLUSION These findings suggest generalized topic areas and identify subgroups who are more likely to provide additional information in their responses, which may add insight into future epidemiologic and military research.
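As a rough illustration of the LSA step named in the methods, the sketch below builds a term-document count matrix from a few short responses, truncates its singular value decomposition, and compares responses by cosine similarity in the reduced space. It is not the authors' pipeline: the toy responses, the function names `lsa_document_vectors` and `cosine`, the whitespace tokenizer, and the choice of `k=2` are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy responses for illustration only; not data from the study.
responses = [
    "knee injury pain after training",
    "back injury pain following surgery",
    "smoke exposure near burning chemicals",
    "chemicals exposure smoke inhalation",
]


def lsa_document_vectors(docs, k=2):
    """Return one k-dimensional LSA vector per document (rows)."""
    # Vocabulary from naive whitespace tokenization (an assumption).
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document matrix of raw counts: rows are terms, columns are documents.
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            counts[index[w], j] += 1.0

    # Truncated SVD: keep the k largest singular values and vectors.
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    return vt[:k].T * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


doc_vectors = lsa_document_vectors(responses, k=2)
sim_injury = cosine(doc_vectors[0], doc_vectors[1])  # two injury responses
sim_cross = cosine(doc_vectors[0], doc_vectors[2])   # injury vs. exposure
print(sim_injury, sim_cross)
```

In this toy setting the two injury-related responses end up much closer to each other than to the exposure-related one. Vectors like `doc_vectors` could then be passed to a clustering routine to group responses by topic, which is the general idea behind the cluster analysis the abstract mentions.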


Similar resources

Text Analytic Techniques in Survey Questionnaire Development and Analysis

This research develops three text analytic techniques to improve survey questionnaires. The first is open-ended response mining. Narrative responses on a survey are mined for themes, which are then used to develop new questions. Closed-ended responses identify subgroups who agree/disagree with the question. Open-ended responses are then examined for systematic differences which suggest new constructs that dis...


Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...


Query expansion based on relevance feedback and latent semantic analysis

Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users. Constructing an adequate query that best represents the user's information need is an important concern for web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...


TOEFL iBT integrated speaking tests: a comparison of test-takers' performance in terms of complexity, accuracy, and fluency

This study compares three integrated tasks of the TOEFL iBT speaking subtest in terms of complexity, accuracy, and fluency. To this end, a group of TOEFL iBT Iranian candidates took a simulated TOEFL iBT some days prior to their real exam. The collected oral responses were first transcribed and then quantified using software such as ‘Syllable Counter’ and ‘Coh-Metrix3’ for fluency and complexit...



Volume: 11

Publication date: 2011